Clustering With Outlier Removal


Abstract

Cluster analysis and outlier detection are two continuously rising topics in the data mining area, which in fact connect to each other deeply. Cluster structure is vulnerable to outliers; inversely, outliers are the points belonging to none of the clusters. Unfortunately, most existing studies do not notice the coupled relationship between these two tasks and handle them separately. In this article, we consider the joint cluster analysis and outlier detection problem, and propose the Clustering with Outlier Removal (COR) algorithm. Specifically, the original space is transformed into a binary space via generating basic partitions. We employ Holoentropy to measure the compactness of each cluster without involving several outlier candidates. To provide a neat and efficient solution, an auxiliary binary matrix is introduced so that COR completely and efficiently solves the challenging problem via a unified K-means-- with theoretical supports. Extensive experimental results on numerous data sets in various domains demonstrate the effectiveness and efficiency of COR significantly over state-of-the-art methods in terms of cluster validity and outlier detection. Some key factors, including the basic partition number and generation strategy in COR, and the application to abnormal flight trajectory detection are further analyzed for practical use.
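The pipeline described in the abstract (basic partitions, a binary space, then a K-means-- style solver) can be illustrated with a minimal sketch. This is not the authors' implementation: the Holoentropy-based weighting and the auxiliary matrix are omitted, plain squared Euclidean distance is used in the binary space, every basic partition uses the same number of clusters, and all function and parameter names (`kmeans_labels`, `binary_encode`, `kmeans_minus_minus`, `n_partitions`, `n_outliers`) are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd's K-means; returns one cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # fancy indexing copies
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def binary_encode(X, n_partitions, k, rng):
    """Stack one-hot encodings of several basic partitions side by side."""
    blocks = []
    for _ in range(n_partitions):
        labels = kmeans_labels(X, k, rng)
        blocks.append(np.eye(k)[labels])  # (n, k) one-hot block
    return np.hstack(blocks)              # (n, n_partitions * k)

def kmeans_minus_minus(B, k, n_outliers, rng, n_iter=30):
    """K-means--: the n_outliers points farthest from their nearest
    center are excluded from each center update and labeled -1."""
    centers = B[rng.choice(len(B), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = ((B[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        nearest = dist.min(axis=1)
        outliers = np.argsort(nearest)[len(B) - n_outliers:]
        keep = np.ones(len(B), dtype=bool)
        keep[outliers] = False
        for j in range(k):
            members = keep & (labels == j)
            if members.any():
                centers[j] = B[members].mean(axis=0)
    labels = labels.copy()
    labels[outliers] = -1
    return labels

# Synthetic data (an assumption for illustration): two blobs plus scattered points.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, (40, 2)),
    rng.normal(5.0, 0.3, (40, 2)),
    rng.uniform(-10.0, 15.0, (5, 2)),
])
B = binary_encode(X, n_partitions=10, k=3, rng=rng)
labels = kmeans_minus_minus(B, k=2, n_outliers=5, rng=rng)
```

Each row of `B` contains exactly one 1 per basic partition, so points that are repeatedly grouped together end up with similar binary rows. How well the flagged points match the true outliers depends on the partition settings, which the abstract lists among the factors analyzed in the paper.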


Similar resources

Clustering with Outlier Removal

Cluster analysis and outlier detection are strongly coupled tasks in the data mining area. Cluster structure can be easily destroyed by a few outliers; conversely, outliers are defined through the concept of a cluster, being recognized as the points that belong to none of the clusters. However, most existing studies handle them separately. In light of this, we consider the joint cluster analysis ...


DCBOR: A Density Clustering Based on Outlier Removal

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well-known single-link clustering algorithm, which we refer to as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultane...


Improved Hybrid Clustering and Distance-based Technique for Outlier Removal

Outlier detection is the task of finding objects that are dissimilar to or inconsistent with the remaining data. It has many uses in applications such as fraud detection, network intrusion detection, and clinical diagnosis of diseases. Using clustering algorithms for outlier detection is a frequently used technique. The clustering algorithms consider outlier detection only to the poi...


Algorithms for optimal outlier removal

We consider the problem of removing c points from a set S of n points so that the remaining point set is optimal in some sense. Definitions of optimality we consider include having minimum diameter, having minimum area (perimeter) bounding box, having minimum area (perimeter) convex hull. For constant values of c, all our algorithms run in O(n log n) time.


Online Clustering and Outlier Detection

Clustering and outlier detection are important data mining areas. Online clustering and outlier detection generally work with continuous data streams generated at a rapid rate and have many practical applications, such as network intrusion detection and online fraud detection. This chapter first reviews the related background of online clustering and outlier detection. Then, an incremental cluste...



Journal

Journal title: IEEE Transactions on Knowledge and Data Engineering

Year: 2021

ISSN: 1558-2191, 1041-4347, 2326-3865

DOI: https://doi.org/10.1109/tkde.2019.2954317